Metacharacter
|
Meaning
|
.
|
Matches any single character.
|
[ ]
|
Indicates a character class. Matches any single character inside the
brackets (for example, [abc] matches "a", "b", or "c").
|
^
|
If this metacharacter occurs at the start of a character class, it negates
the character class. A negated character class matches any character
except those inside the brackets (for example, [^abc] matches all
characters except "a", "b", and "c").
If ^ is at the beginning of the regular expression, it matches the
beginning of the input (for example, ^[abc] will only match input that
begins with "a", "b", or "c").
|
-
|
In a character class, indicates a range of characters (for example, [0-9]
matches any of the digits "0" through "9").
|
?
|
Indicates that the preceding expression is optional: it matches once or
not at all (for example, [0-9][0-9]? matches "2" and "12").
|
+
|
Indicates that the preceding expression matches one or more times (for
example, [0-9]+ matches "1", "13", "666", and so on).
|
*
|
Indicates that the preceding expression matches zero or more times (for
example, [0-9]* matches the empty string, "1", "13", and so on).
|
??, +?, *?
|
Non-greedy versions of ?, +, and *. These match as little as possible,
unlike the greedy versions which match as much as possible. Example:
given the input "<abc><def>", <.*?> matches "<abc>" while <.*>
matches "<abc><def>".
|
( )
|
Grouping operator. Example: (\d+,)*\d+ matches a list of numbers
separated by commas (such as "1" or "1,23,456").
|
{ }
|
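Indicates a match group: the text matched by the enclosed expression is
captured and can be referred to later with \ followed by the group number
(for example, in <{.*?}> the group captures the text between "<" and ">").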
|
\
|
Escape character: interpret the next character literally (for example,
[0-9]+ matches one or more digits, but [0-9]\+ matches a digit followed by
a plus character). Also used for abbreviations (such as \a for any
alphanumeric character; see table below).
If \ is followed by a number n, it matches the nth match group (starting
from 0). Example: <{.*?}>.*?</\0> matches
"<head>Contents</head>".
|
$
|
At the end of a regular expression, this character matches the end of the
input. Example: [0-9]$ matches a digit at the end of the input.
|
|
|
Alternation operator: separates two expressions, exactly one of which
matches (for example, T|the matches "The" or "the").
|
!
|
Negation operator: the expression following ! does not match the input.
Example: a!b matches "a" not followed by "b".
|
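The table does not name the engine it describes, so the following is only a sketch of how a few of its examples might be reproduced with Python's standard `re` module. Greedy and non-greedy quantifiers and ( ) grouping are written the same way in both. The rest are assumed equivalents, not part of the syntax above: Python captures with ( ) instead of { }, numbers groups from 1 instead of 0, and has no ! operator, so a negative lookahead stands in for it.

```python
import re

# Greedy vs. non-greedy, using the input from the table.
text = "<abc><def>"
greedy = re.match(r"<.*>", text).group(0)    # consumes as much as possible
lazy = re.match(r"<.*?>", text).group(0)     # stops at the first ">"

# Grouping: a list of numbers separated by commas.
number_list = re.compile(r"(\d+,)*\d+")
is_list = number_list.fullmatch("1,23,456") is not None

# Match groups: the table writes <{.*?}>.*?</\0>.
# Assumed Python equivalent: ( ) captures, and the first group is \1.
tag = re.compile(r"<(.*?)>.*?</\1>")
tag_match = tag.fullmatch("<head>Contents</head>")
tag_name = tag_match.group(1)

# Negation: the table writes a!b for "a" not followed by "b".
# Assumed Python equivalent: a negative lookahead.
not_followed = re.compile(r"a(?!b)")
first_hit = not_followed.search("ab ac").start()
```

Here `greedy` holds the whole input, `lazy` holds only the first tag, and `tag_name` holds the text captured by the group.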
Abbreviation
|
Matches
|
\a
|
Any alphanumeric character.
|
\b
|
White space (blank).
|
\c
|
Any alphabetic character.
|
\d
|
Any decimal digit.
|
\h
|
Any hexadecimal digit.
|
\n
|
Newline.
|
\q
|
A quoted string.
|
\w
|
A simple word.
|
\z
|
An integer.
|
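Several of these abbreviations mean something different in other engines (in Python's `re`, for instance, \b is a word boundary and \w also matches digits and underscores). As a rough sketch, the abbreviations could be spelled out as explicit patterns like this. The expansions are assumptions inferred from the descriptions in the table, and `ABBREVIATIONS` and `matches` are hypothetical names used only for illustration.

```python
import re

# Assumed expansions: the table gives only the descriptions in the comments.
ABBREVIATIONS = {
    r"\a": r"[a-zA-Z0-9]",               # any alphanumeric character
    r"\b": r"[ \t]",                     # white space (blank)
    r"\c": r"[a-zA-Z]",                  # any alphabetic character
    r"\d": r"[0-9]",                     # any decimal digit
    r"\h": r"[0-9a-fA-F]",               # any hexadecimal digit
    r"\n": r"\r?\n",                     # newline
    r"\q": r'"[^"]*"' + r"|'[^']*'",     # a quoted string
    r"\w": r"[a-zA-Z]+",                 # a simple word
    r"\z": r"[0-9]+",                    # an integer
}


def matches(abbreviation: str, text: str) -> bool:
    """Return True if all of text matches the assumed expansion."""
    return re.fullmatch(ABBREVIATIONS[abbreviation], text) is not None
```

For example, `matches(r"\h", "F")` is true while `matches(r"\h", "g")` is false, and `matches(r"\w", "word1")` is false because the assumed "simple word" expansion allows letters only.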